Abstract


The sequence of the S gene of a field canine coronavirus (CCoV), strain Elmo/02, revealed low nucleotide (61%) and amino acid
(54%) identity to reference CCoV strains. The highest identity (77% nt and 81.7% aa) was found with feline coronavirus type I. A
PCR assay for the S gene of strain Elmo/02 detected analogous CCoVs of different geographic origin, all of which exhibited
92-96% nucleotide identity to each other and to strain Elmo/02. The evident genetic divergence between the reference CCoV strains
and the newly identified Elmo/02-like CCoVs strongly suggests that a novel genotype of CCoV is widespread in the dog population.


© 2003 Elsevier Science B.V. All rights reserved.


1. Introduction


Canine coronavirus (CCoV) is an enveloped, positive-stranded
RNA virus of dogs associated with moderate
to severe enteritis in young pups. The genome contains
two large open reading frames (ORFs), 1a and 1b,
encoding two polyproteins that give rise to the viral
replicase. Downstream of ORF1b there are 8-10
smaller ORFs, including those encoding the structural
proteins S (ORF2), E (ORF4), M (ORF5) and the
nucleocapsid (N) protein (Enjuanes et al., 2000).

The small membrane protein (E) has recently been found
to be important for viral envelope assembly
(Raamsman et al., 2000). The M protein is a type III
glycoprotein consisting of a short amino-terminal
ectodomain, a triple-spanning transmembrane domain, and
a long carboxyl-terminal inner domain (Rottier, 1995).
ORF2 encodes a glycosylated protein (S) ranging
from 1160 to 1452 amino acids (aa) in length (Enjuanes
et al., 2000), constituting the large, petal-shaped spikes
on the surface of the virion. This large protein can be
divided into three structural domains. The large external
domain at the N-terminus is divided further into two
subdomains, S1 and S2. The S1 subdomain includes the
N-terminal half of the molecule and forms the globular
portion of the spikes. It contains sequences that are
responsible for binding to specific receptors on the
membrane of susceptible cells. S1 sequences are variable,
containing various degrees of deletions and substitutions
in different coronavirus strains or isolates. Mutations in
the S1 region have been associated with altered
antigenicity and pathogenicity. In contrast, S2 sequences are
more conserved and contain two heptad repeat motifs
that suggest a coiled-coil structure (Lai and Holmes,
2001).

On the basis of phylogenetic analysis and antigenic 
cross reactivity, three groups can be distinguished in the 
Coronaviridae family. Group I includes CCoV, the 
transmissible gastroenteritis virus of swine (TGEV), 
the porcine epidemic diarrhoea virus (PEDV), the 
porcine respiratory coronavirus (PRCoV), the feline 
coronaviruses (FCoVs) and the human coronavirus 
229E (HCoV 229E). FCoVs can be divided into
two serotypes, I and II, on the basis of a virus
neutralization assay in vitro using both type-specific 
feline sera and monoclonal antibodies directed against 
the S protein (Herrewegh et al., 1998). In the field, 
FCoVs type I are predominant and FCoVs type II are 
detected only sporadically. Differences in the S gene of 
FCoVs type I and that of FCoVs type II may also 
account for the different properties observed in vitro, as 
indeed FCoVs type I grow poorly in tissue culture cells 
(Pedersen et al., 1984) while type II strains grow well. 

In a previous study, sequence analysis of CCoVs 
detected in faecal samples collected from dogs with 
diarrhoea revealed multiple nucleotide substitutions 
accumulating over a fragment of the M gene (Pratelli 
et al., 2001). A genetic drift to FCoV type II was also 
observed in the sequence of CCoVs detected in the 
faeces of two pups infected naturally during the late 
stages of long-term viral shedding. It was thus hypothesized
that (i) the dogs might have been infected by a
mixed population of genetically different CCoVs, or (ii)
the viruses detected in both pups were the result of
mutation/recombination events (Pratelli et al., 2002b).

Subsequently, extensive sequence analysis of multiple
regions of the viral genome, including ORF1a, ORF1b
and ORF5, of several CCoV-positive faecal samples
provided strong evidence for the existence of two
separate genetic clusters of CCoV. The first cluster
includes CCoVs intermingled with reference CCoV
strains, such as Insavc-1 and K378, while the second
cluster segregates separately from CCoVs and, presumably,
represents a genetic outlier referred to as FCoV-like
CCoV (Pratelli et al., 2003).

The aim of the present study was to evaluate the 
genetic differences between the FCoV-like and the 
‘typical’ CCoVs in the sequence of the gene encoding 
for the S protein. 


2. Materials and methods


2.1. Faecal samples


Twenty faecal samples, collected in four kennels in
Southern Italy from 2-6-month-old pups affected with
diarrhoea, were tested. Three of the kennels were sited in
different areas of Puglia, about 50 km from each other,
while the fourth shelter was located in Abruzzo, more
than 400 km from the other three. The faecal samples
were stored at -20 °C until tested. All the samples were
negative by a haemagglutination test for canine
parvovirus and positive for CCoV when submitted to
a PCR assay targeting a fragment of the M gene
(primers CCoV1-CCoV2) (Pratelli et al., 1999). The
presence of FCoV-like CCoVs in the same samples was
detected by means of a differential PCR assay, using
primers (CCoV1a-CCoV2) able to recognise nucleotide
substitutions conserved in the M gene across all the
FCoV-like CCoVs (Pratelli et al., 2002a). The sequences
of the primers and their positions in the M gene are
shown in Table 1.


2.2. PCR on the S gene 


Comparative sequence analysis of the S gene has
revealed a higher degree of variation at the N-terminus
than at the C-terminus of the S protein (Jacobs et
al., 1987; Motokawa et al., 1996; Horsburgh and Brown,
1995; Wesley, 1999). Taking into account the sequence
drift towards FCoVs observed in the M gene, we designed a
pair of primers, UCD1F-UCD1R, amplifying a 502 bp
fragment at the very 3’ end of the S gene
of FCoVs type I (strains UCD1, KU-2 and Black),
which encodes the highly conserved C-terminus of
the spike protein. All 20 canine samples were tested
with this primer pair and yielded an amplicon of the
expected size. The sequence of the amplicons
was determined by direct sequencing of the PCR
products and displayed 81-82% nucleotide identity to
FCoV type I strains.
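
The expected amplicon size follows directly from the primer coordinates. The sketch below is illustrative only: the function name is hypothetical, and it assumes the positions listed in Table 1 are 1-based and inclusive on the UCD1 reference sequence, so that the product spans from the first base of the forward primer to the last base of the reverse primer.

```python
def amplicon_size(forward_start: int, reverse_end: int) -> int:
    """Length in bp of a PCR product spanning forward_start..reverse_end.

    Coordinates are assumed to be 1-based and inclusive on the same
    reference sequence (an assumption, not stated in the paper).
    """
    if reverse_end < forward_start:
        raise ValueError("reverse primer must lie downstream of the forward primer")
    return reverse_end - forward_start + 1


# Coordinates taken from Table 1 (numbering of FCoV type I strain UCD1).
print("UCD1F-UCD1R:", amplicon_size(3704, 4205), "bp")
print("V3F-V3R:", amplicon_size(3274, 4017), "bp")
```

With these coordinates the two pairs give 502 bp and 744 bp, matching the amplicon sizes reported for UCD1F-UCD1R and V3F-V3R.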


2.3. Determination of the sequence of the S gene of 
FCoV-like CCoV Elmo/02 


To verify the extent of genetic variation between the
two clusters of CCoV in the S gene, we determined the
nearly full-length sequence of the ORF2 of one (Elmo/02)
of the samples that had tested positive for FCoV-like
CCoV.

Degenerate primers were designed with the CODEHOP
strategy (Rose et al., 1998), using a wide selection
of coronaviruses belonging to group I of the Coronaviridae.
This strategy is based on the identification of
blocks of homology in the amino acid sequences of
distantly related organisms. Hybrid oligonucleotides,
with a short 3’ degenerate core region and a longer 5’
consensus clamp region, are selected by retro-translation
of the blocks identified. CODEHOP sense primers
with a low degeneracy index were selected to amplify
overlapping fragments of ORF2. One-step reverse
transcription and PCR amplification were carried out
using SuperScript™ One-Step RT-PCR for Long Templates
(Life Technologies, Invitrogen, Milan, Italy). To
select against any CCoV-like virus during PCR amplification,
reverse primers specific for the S gene of FCoV-like
CCoV Elmo/02 were used in combination with the
forward degenerate primers. Reference CCoV strains,
S378, K378, 45/93 (Buonavoglia et al., 1994), USDA, 1/71
and SE, were used as controls in the PCR reactions,
to verify that no CCoV was amplified by the primers.
The amplicons were cloned into pCR®2.1-TOPO®
vectors (TOPO TA Cloning®, Invitrogen, Milan, Italy)
and the recombinant clones were identified by blue/white
screening. Plasmid DNA was extracted and subjected to
sequence analysis (Genome Express: Labo Grenoble,
France). Following this strategy, the fragments of the
FCoV-like virus Elmo/02 were inserted into the clones.
A consensus of the sequences obtained was determined
and the overlapping fragments were manually edited.
Alignments and sequence analysis were performed using
the BIOEDIT software package (Hall, 1999). The strategy
used to determine the sequence of the S gene of the
FCoV-like CCoV is outlined in Fig. 1. The positions and
sequences of the degenerate primers are reported in
Table 1. The nucleotide sequence of Elmo/02 will appear
in the DDBJ/EMBL/GenBank databases under accession
no. AY170345.

The amino acid sequence of the FCoV-like CCoV Elmo/02 was
inferred and aligned with a selection of coronaviruses of
group I. Phylogenetic and molecular evolutionary
analyses were conducted using MEGA version 2.1
(Kumar et al., 2001) and PAUP version 4.0b (Swofford,
1998). A parsimony tree was elaborated using a heuristic
algorithm, with statistical support supplied by
bootstrapping over 100 replicates.


Table 1
Primers used for PCR amplification and sequence analysis

Primer      Gene  Viruses (a)                         Sequence 5’ to 3’              Sense  Position       Amplicon size
CCoV1 (b)   M     CCoV, FCoV-like CCoV, FCoV type II  TCCAGATATGTAATGTTCGG           +      6729-6748 (d)  409 bp
CCoV2 (b)                                             TCTGTTGAGTAATCACCAGCT          -      7138-7118 (d)
CCoV1a (c)  M     FCoV-like CCoV, FCoV type II        GTGCTTCCTCTTGAAGGTACA          +      6900-6920 (d)  239 bp
CCoV2 (b)                                             TCTGTTGAGTAATCACCAGCT          -      7138-7118 (d)
UCD1F       S     FCoV-like CCoV                      CAGAATGGGAAGAAGTGACG           +      3704-3723 (e)  502 bp
UCD1R                                                 CACACATACCAAGGCCATTTT          -      4185-4205 (e)
V1F         S     FCoV-like CCoV                      AAGGACGAGTGCACCGACTAYAAYATHTA  +      2092-2120 (e)  1698 bp
V1R                                                   TGCATACGTGTCATTAACACAA         -      3738-3759 (e)
V2F         S     FCoV-like CCoV                      GACGGCTTCTCCTTCAACAAYTGGTTYHT  +      868-896 (e)    1420 bp
V2R                                                   CAGCAGCTTGAGCAGTTAAATC         -      2257-2278 (e)
V3F         S     FCoV-like CCoV                      GTTTCTGATGCTATTAGTACTGTTTCC    +      3274-3300 (e)  744 bp
V3R                                                   ACCTTCAGTAAAATCTGGAATTGTG      -      3993-4017 (e)

(a) FCoV type I has not been tested.
(b) Pratelli et al. (1999).
(c) Pratelli et al. (2002a).
(d) Primer positions refer to the sequence of CCoV strain Insavc-1 (accession: D13096).
(e) Primer positions refer to the sequence of FCoV type I strain UCD1 (accession: AB088222).


Fig. 1. Outline of the strategy followed to determine the sequence of ORF2 of strain Elmo/02. Dashed arrows indicate the degenerate primers. The
position of the other primer pair used in this study, V3F-V3R, is also reported.


Table 2
Amino acid identities between the spike proteins of group I coronaviruses

          Elmo/02  KU-2     UCD1     79-1146  79-1683  Insavc-1 K378     Purdue   RM-4     CV777    vNotI-tk
Elmo/02   /
KU-2      80.81    /
UCD1      81.76    91.57    /
79-1146   54.31    53.71    53.62    /
79-1683   53.88    53.19    53.02    97.34    /
Insavc-1  54.31    53.62    53.45    94.93    95.44    /
K378      54.31    53.45    53.36    95.62    95.96    95.36    /
Purdue    54.05    53.71    53.45    93.98    94.24    92.52    93.38    /
RM-4      53.62    53.28    53.1     92.6     92.52    91.23    91.92    96.56    /
CV777     52.67    52.93    53.88    54.25    53.24    54.39    53.88    54.65    54.91    /
vNotI-tk  50.69    51.38    51.98    53.28    54.31    52.87    53.19    53.19    53.28    55.68    /

Values indicate the arithmetic average extrapolated from the amino acid matrix of comparison and are expressed in percentage.
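
The degeneracy of the CODEHOP sense primers listed in Table 1 (V1F and V2F) can be checked by counting the bases encoded at each position. The following is a minimal sketch, not the CODEHOP program itself: it uses the standard IUPAC nucleotide ambiguity codes, and the function names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every non-degenerate sequence encoded by the primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer.upper()))]


# Degenerate sense primers from Table 1.
V1F = "AAGGACGAGTGCACCGACTAYAAYATHTA"
V2F = "GACGGCTTCTCCTTCAACAAYTGGTTYHT"
print(degeneracy(V1F), degeneracy(V2F))
```

Both primers carry two Y positions and one H position in the 3’ core, giving a 12-fold degeneracy each, which is consistent with the low degeneracy index sought by the strategy.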


2.4. Analysis of field samples with a PCR specific for the 
S gene of FCoV-like CCoVs 


The primer pair, V3F-V3R, designed on the sequence
of virus Elmo/02, was chosen to selectively amplify
FCoV-like CCoVs. The sequences and positions of the
primers are shown in Table 1. All the samples previously
characterised as FCoV-like by two separate primer pairs
targeted to the M gene (CCoV1a-CCoV2) (Pratelli et
al., 2002a) and to the S gene (UCD1F-UCD1R) were
screened with the new primers specific for virus Elmo/02.
The RNA was reverse transcribed with random
hexamers using MuLV Reverse Transcriptase (Applied
Biosystems, Roma, Italy) and then amplified with
AmpliTaq DNA polymerase (Applied Biosystems,
Roma, Italy), by 40 cycles at 94 °C for 1 min, 55 °C
for 1 min and 72 °C for 1 min.
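
The cycling conditions above can be written down as a small data structure, which makes the programme easy to check or transfer to a thermocycler. This is only a sketch: the step names (denaturation, annealing, extension) are the conventional reading of the three temperatures and are an assumption, as are the class and function names. Ramp times and any initial or final hold are not stated in the text and are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str       # conventional label, assumed rather than stated in the text
    temp_c: int     # temperature in degrees Celsius
    seconds: int    # hold time


# One cycle as described: 94 °C, 55 °C and 72 °C for 1 min each.
CYCLE = (
    Step("denaturation", 94, 60),
    Step("annealing", 55, 60),
    Step("extension", 72, 60),
)
N_CYCLES = 40


def cycling_minutes(cycle: tuple[Step, ...], n_cycles: int) -> float:
    """Total hold time in minutes, excluding ramping between temperatures."""
    return n_cycles * sum(step.seconds for step in cycle) / 60


print(cycling_minutes(CYCLE, N_CYCLES), "min of hold time")
```

Under these assumptions the 40 cycles amount to 120 min of hold time.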

To assess the intra-genotypic variability in the S gene 
of FCoV-like CCoVs, four strains, each representative 
of a different geographical area, were selected and 
subjected to sequence analysis. 


3. Results 


All 20 faecal samples characterised previously as
FCoV-like CCoVs were recognised by the primer pair
UCD1F-UCD1R, yielding an amplicon of the expected
size of 502 bp.

About 80% (3347 nucleotides) of the ORF encoding
the S protein of strain Elmo/02 was determined.
Using the ORF2 of strain UCD1 as a reference
sequence, the fragment determined maps approximately
between nt 868 and 4205 and between aa 300 and 1401.

The highest nucleotide identity was to FCoV type I
strains KU-2, UCD1 and Black (~77%), whereas
identity to FCoVs type II and CCoVs was about 61%.
Comparison of the inferred amino acid sequences
revealed 80.81-81.76% identity to FCoVs type I,
53.88-54.31% to FCoVs type II and 54.31% to reference
CCoV strains (Table 2). In accordance with previous
observations, the sequence of the S protein was much
more conserved at the C-terminus than at the
N-terminus. For instance, amino acid identity to the
best-matching sequence (strain KU-2) ranged from 73.39 to
88.4% and to strain Insavc-1 from 41.4 to 65.51% in the
N- and C-terminus, respectively.
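
Identity values of this kind are obtained by counting matching positions in a pairwise alignment. The sketch below shows one common way of doing so; it is not the BIOEDIT implementation, and the treatment of gaps (columns with a gap in either sequence are skipped) is an assumption, since the text does not say how gaps were handled. The two short sequences are invented for illustration.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of identical residues between two aligned sequences.

    Columns containing a gap in either sequence are ignored (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from the Elmo/02 sequence.
toy_a = "MKVLA-TG"
toy_b = "MKILAQTG"
print(round(percent_identity(toy_a, toy_b), 2))
```

Applying the same function to the N-terminal and C-terminal portions of an alignment separately (by slicing the aligned strings) gives region-specific values like those quoted above.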

The inferred amino acid sequence of the S protein of
strain Elmo/02 is shown in Fig. 2. Similar to other
coronaviruses, there were several potential N-glycosylation
sites, Asn-X-Ser (NXS) or Asn-X-Thr (NXT).
Most of the glycosylation sites were conserved between
strain Elmo/02 and feline coronaviruses and CCoVs, in
particular with respect to the most closely related FCoVs type I.
Interestingly, a potential cleavage site, the stretch of
basic amino acids Arg-Arg-Ala-Arg-Arg (RRARR),
was found. The basic stretch is at approximately the same
position as in the S protein of group II and group III
coronaviruses, but it is absent in the S protein of all the
other group I coronaviruses.
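
Both features can be located in a protein sequence with simple pattern matching. The sketch below follows the definition given in the text (Asn-X-Ser or Asn-X-Thr with any residue at X); many prediction tools additionally exclude proline at X, which is not done here. Function names and the example sequence are hypothetical and are not taken from the Elmo/02 S protein.

```python
import re

# Lookahead so that overlapping sequons (e.g. N-N-T-S) are all reported.
SEQUON = re.compile(r"(?=N[A-Z][ST])")


def find_sequons(protein: str) -> list[int]:
    """1-based positions of the Asn in every N-X-S / N-X-T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def find_motif(protein: str, motif: str = "RRARR") -> list[int]:
    """1-based start positions of every occurrence of a literal motif."""
    protein = protein.upper()
    hits = []
    start = protein.find(motif)
    while start != -1:
        hits.append(start + 1)
        start = protein.find(motif, start + 1)
    return hits


# Invented toy sequence for illustration only.
toy = "MNKSANNTSRRARRLNGT"
print(find_sequons(toy))   # positions of candidate glycosylated Asn residues
print(find_motif(toy))     # positions of the basic stretch
```

A match only marks a potential site; whether a sequon is actually glycosylated, or the basic stretch actually cleaved, has to be established experimentally.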

Parsimony analysis of the S protein of group I
coronaviruses revealed that strain Elmo/02 is much
more closely related to FCoVs type I than to typical
CCoVs. Conversely, typical CCoVs segregate tightly
with FCoVs type II and the porcine coronaviruses
TGEV and PRCoV (Fig. 3).
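
The statistical support for such a tree comes from bootstrapping, in which the columns of the alignment are resampled with replacement and a tree is rebuilt from each pseudo-replicate. The sketch below shows only the resampling step, under the assumption of a simple dictionary-based alignment; tree construction, which in this study was done in PAUP, is not reproduced. The function name and toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one bootstrap pseudo-replicate of an alignment.

    Columns are drawn with replacement; the same columns are used for
    every sequence so that homologous positions stay together.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    columns = rng.choices(range(n_cols), k=n_cols)
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Invented toy alignment; names and residues are placeholders.
toy_alignment = {"strain_A": "MKVLAT", "strain_B": "MKILAT", "strain_C": "MRILST"}
rng = random.Random(0)  # fixed seed so the run is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]
print(len(replicates), "pseudo-replicates generated")
```

The support for a given branch is then the percentage of the 100 replicate trees in which that branch is recovered.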

The new pair of primers, V3F-V3R, specific for
strain Elmo/02, successfully amplified all 20 samples
tested, yielding the expected PCR product of 744 bp. The
sequence of the V3F-V3R amplicon of four strains
representative of different geographical areas was
determined, revealing a nucleotide variability of 4-8%.


Fig. 2. Alignment of the deduced amino acid sequence of the S protein of the FCoV-like strain Elmo/02 with reference CCoVs (Insavc-1, K378 and 5821), FCoVs type I (UCD1, Black and KU-2),
FCoVs type II (79-1146 and 79-1683), TGEVs (Purdue and Miller), PRCoV (RM4), PEDV (CV777) and HCoV 229E (vNotI-tk). The potential glycosylation sites (*) and the putative cleavage site (v) are indicated.


Fig. 3. Parsimony phylogenetic tree of the S protein of group I coronaviruses. The tree is drawn to scale and rooted using the vaccinal strain vNotI-tk
of HCoV 229E. The coronavirus sequences reported are available from GenBank under the following accession numbers. CCoVs: Insavc-1,
D13096; K378, X77047; C54, A22886; 5821, AB017789. FCoVs type II: 79-1146, X06170; 79-1683, X80799. FCoVs type I: KU-2, D32044; Black,
AB088223; UCD1, AB088222. TGEV: Miller, S51223; Purdue, X05695. PRCoV: RM4, Z24675. PEDV: CV777, NC_00346. HCoV: vNotI-tk,
NC_002645.1.


4. Discussion 


Genetic divergence within the coronavirus group I is 
accounted for by linear evolution as well as by a sudden, 
dramatic shift generated by RNA deletions or recombi- 
nation. For instance, the S protein of PEDV occupies an 
intermediate position between HCoV 229E and TGEV 
(Kocherhans et al., 2001), while the S protein of PRCoV 
is closely related to TGEV but has a large deletion in the 
N-terminus (more than 200 aa) that may explain the 
change in the pathobiology of the virus (Vaughn et al.,
1994).

Comparative sequence analysis of the genomes of
FCoVs type I and type II and CCoV has demonstrated
that FCoV type II has arisen from a template switch
between FCoV type I and CCoV, which took place
between the S and M genes. An additional template
switch has been mapped in the ORF1b region for strain
FCoV 79-1146 and in the ORF1ab region for strain
FCoV 79-1683. The double recombination event resulted
in the introduction of a large genome fragment,
encompassing the CCoV-like S-encoding gene, into the
background of an FCoV genome (Herrewegh et al.,
1998).

The S gene of CCoV is closely related to that of FCoVs type
II, TGEVs and PRCoVs, and is more divergent from that of
FCoVs type I, PEDVs and HCoV 229E (Wesseling et
al., 1994). So far, little evidence has been provided for
genetic drifts or shifts affecting CCoV. Wesley (1999)
described a canine strain displaying a higher
sequence identity to TGEV in the N-terminus of the S
protein, explained as a possible recombination between
CCoV and TGEV, and related to improved growth in
swine testicular cells. The findings of the present study
clearly indicate that a novel CCoV type, highly divergent
from the reference CCoV strains and more closely
related to FCoVs type I, circulates among dogs. Indeed,
by means of RT-PCR, Elmo/02-like strains were successfully
detected in all the samples tested. All the
samples had been characterised as ‘atypical’ CCoVs
when screened with an RT-PCR targeted to the M gene
and able to distinguish between the two genetic lineages
previously identified (Pratelli et al., 2002a). Extensive
sequence analysis of multiple regions in ORF1a and
ORF1b, as well as in the M-encoding gene, has confirmed the
existence of a distinct genetic lineage of CCoV, evolutionarily
localised between CCoV and FCoV. Many
amino acid residues observed in the M protein of FCoV-like
CCoVs are the same as in FCoVs and presumably
represent a retention of the sequence of an ancestral
virus (Pratelli et al., 2003). Re-considering these data in
the light of the findings of the present study and
considering the analogies with closely related viruses,
we have concluded that the extent of genetic variation
observed within the CCoVs is limited in ORF1a and
slightly greater in ORF1b and ORF5, though it still
accounts for a clear pattern of segregation into a distinct
genetic lineage. The two genotypes of CCoV diverge
dramatically in ORF2, where there is more than
38.4% nucleotide and about 45.5% amino acid variation
from reference CCoVs. Analysis of the S gene of the
Elmo/02-like CCoVs revealed a low degree of variation
(4-8%), which may be explained by their different
geographical origin. The majority of the sequence
changes observed are conservative, demonstrating that
there is some heterogeneity in the ORF2 of Elmo/02-like
CCoVs. In conclusion, the findings suggest that the two
canine genotypes underwent a linear evolution rather
than a sudden shift originating from a recombination
event analogous to those leading to the appearance of
FCoVs type II. Finally, recombination with an ancestral
coronavirus from which FCoVs type I and Elmo/02-like
CCoVs directly evolved cannot be excluded.

Whether the Elmo/02-like CCoVs have phenotypic
properties different from those of typical CCoVs,
similarly to FCoVs type I and II, will be interesting to
evaluate. The high divergence in the amino acid
composition and the loss and gain of potential glycosylation
sites, compared to the most closely related
coronaviruses (FCoV type I, FCoV type II and typical
CCoV), strongly suggest that the Elmo/02 strain is
poorly correlated antigenically with the other coronaviruses
of dogs and cats. Moreover, the presence of the
stretch of basic residues RRARR is indicative of a
potential cleavage of the protein (Wesseling et al., 1994).
A similar basic motif is present, in approximately the
same position, in all the coronaviruses identified to date
of both group II and III, but it is absent in all the other
coronaviruses of group I. Cleavage of the S protein of
coronaviruses has been correlated with cell-fusion activity
in vitro (Hingley et al., 1998) but the potential implications
for viral pathobiology have not been determined.

On the basis of the significant genetic differences
between the reference and the Elmo/02-like CCoVs, our
tentative proposal is to designate the new genotype
identified as CCoV type I, and to designate the reference
strains, such as Insavc-1 and K378, as CCoVs type II.
This new designation does not take into account the
order of discovery of the viruses, but it is based on the
genetic similarity between CCoVs type II and FCoVs
type II and between CCoV type I and FCoV type I.